Method for Retrieving a Similar Sentence and Its Application to Machine Translation
Abstract
In this paper, we propose incorporating similar sentence retrieval into machine translation to improve the translation of hard-to-translate input sentences. If a given input sentence is hard to translate, a sentence similar to it is retrieved from a monolingual corpus of translatable sentences and provided to the MT system in place of the original. This method is advantageous in that it relies only on a monolingual corpus. The similarity between an input sentence and each sentence in the corpus is determined from the ratio of common N-grams. We use two conditions to improve the retrieval precision and add a filtering method to avoid inappropriate sentences. An experiment using a Japanese-to-English MT system in a travel conversation domain shows that our method improves the translation quality of hard-to-translate input sentences by 9.8%.
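The retrieval step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact similarity formula, the two precision conditions, or the filtering method, so everything below is an assumption made for illustration: the ratio is taken as the share of the input's N-grams that also occur in the candidate, tokens are split on whitespace (Japanese input would first need word segmentation), and the names `common_ngram_ratio`, `retrieve_similar`, and the `min_ratio` threshold are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the multiset of n-grams (as tuples) found in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def common_ngram_ratio(input_tokens, candidate_tokens, n=2):
    """Share of the input's n-grams that also occur in the candidate.

    Assumed definition: the paper's exact ratio is not stated in the abstract.
    """
    input_grams = ngrams(input_tokens, n)
    total = sum(input_grams.values())
    if total == 0:
        return 0.0  # input shorter than n tokens: no n-grams to compare
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((input_grams & ngrams(candidate_tokens, n)).values())
    return shared / total


def retrieve_similar(input_sentence, corpus, n=2, min_ratio=0.5):
    """Return the best-scoring corpus sentence, or None if none reaches min_ratio.

    min_ratio is a hypothetical stand-in for the paper's filtering step.
    """
    input_tokens = input_sentence.split()
    best, best_score = None, 0.0
    for sentence in corpus:
        score = common_ngram_ratio(input_tokens, sentence.split(), n)
        if score > best_score:
            best, best_score = sentence, score
    return best if best_score >= min_ratio else None


corpus = ["where is the station", "I would like a coffee"]
print(retrieve_similar("where is the nearest station", corpus))
```

In this sketch the retrieved sentence, rather than the original input, would then be passed to the MT system; returning `None` corresponds to finding no sufficiently similar sentence.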
Similar resources
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
Retrieving Meaning-equivalent Sentences for Example-based Rough Translation
Example-based machine translation (EBMT) is a promising translation method for speech-to-speech translation because of its robustness. It retrieves example sentences similar to the input and adjusts their translations to obtain the output. However, it has problems in that the performance degrades when input sentences are long and when the style of inputs and that of the example corpus are differ...
The Effect of Using Keyword Method on EFL Learners' Learning and Retrieving English Verb Types
This study used keyword method during encoding information in transferring information from short term memory to make the retrieval easier. For this purpose, 50 adult female elementary students were chosen to participate in this study. This study required two groups of learners (control and experimental groups). The experimental group enjoyed some special flashcards which each of them involved ...
Identification of Divergence for English to Hindi EBMT
Divergence is a key aspect of translation between two languages. Divergence occurs when structurally similar sentences of the source language do not translate into sentences that are similar in structures in the target language. Divergence assumes special significance in the domain of Example-Based Machine Translation (EBMT). An EBMT system generates translation of a given sentence by retrievin...
Extracting a Parallel Corpus from Comparable Documents to Improve Translation Quality in Machine Translation Systems
Data used for training statistical machine translation method are usually prepared from three resources: parallel, non-parallel and comparable text corpora. Parallel corpora are an ideal resource for translation but due to lack of these kinds of texts, non-parallel and comparable corpora are used either for parallel text extraction. Most of existing methods for exploiting comparable corpora loo...
Publication date: 2004